A Portable InfiniBand Module for MPICH2/Nemesis: Design and Evaluation
Authors
Abstract
With the emergence of multi-core processors, it is becoming increasingly important to optimize both intra-node and inter-node communication in an MPI stack. The MPICH2 group has recently introduced a new Nemesis-based MPI stack that provides a highly optimized design for intra-node communication. It also provides a modular design for different inter-node networks. Currently, the MPICH2/Nemesis stack supports only TCP/IP and Myrinet. The TCP/IP interface allows this stack to run on the emerging InfiniBand network through IPoIB support. However, this approach does not deliver good performance and cannot exploit the novel mechanisms and features provided by InfiniBand. In this paper, we take on the challenge of designing a portable InfiniBand network module (IB-netmod) for Nemesis. The IB-netmod is designed over the Verbs-level interface of InfiniBand and can take advantage of all features and mechanisms of InfiniBand. A complete design of the IB-netmod, together with the associated challenges, is presented. A comprehensive performance evaluation (micro-benchmarks, collectives, and applications) of the new Nemesis-IB design is carried out against Nemesis TCP/IP (with IPoIB support on InfiniBand) and the native IB support of the MVAPICH2 stack. The new IB-netmod delivers performance comparable to that of the native IB support of MVAPICH2. Compared to the MPICH2/IPoIB support for InfiniBand, the new design delivers significant performance benefits. For the NAMD application with 256 cores, the new IB-netmod delivers a 4% improvement over the latest MVAPICH2 release. To the best of our knowledge, this is the first IB-netmod design for the MPICH2/Nemesis framework. The next release of MVAPICH2 will include this new IB-netmod support.
Similar resources
A uGNI-Based MPICH2 Nemesis Network Module for Cray XE Computer Systems
Recent versions of MPICH2 have featured Nemesis, a scalable, high-performance, multi-network communication subsystem. Nemesis provides a framework for developing Network Modules (Netmods) for interfacing the Nemesis subsystem to various high-speed network protocols. Cray has developed a User-Level Generic Network Interface (uGNI) for interfacing MPI implementations to the internal high speed net...
Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem
This paper presents the implementation of MPICH2 over the Nemesis communication subsystem and the evaluation of its shared-memory performance. We describe design issues as well as some of the optimization techniques we employed. We conducted a performance evaluation over shared memory using microbenchmarks. The evaluation shows that MPICH2 Nemesis has very low communication overhead, making it ...
Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem
This paper presents the implementation of MPICH2 over the Nemesis communication subsystem and the evaluation of its shared-memory performance. We describe design issues as well as some of the optimization techniques we employed. We conducted a performance evaluation over shared memory using microbenchmarks as well as application benchmarks. The evaluation shows that MPICH2 Nemesis has very low c...
Implementation over VAPI on InfiniBand: Challenges, Design Experiences, and Performance Evaluation (a work-in-progress report, status 07/07/03)
More and more clusters are already equipped with, or planned around, InfiniBand as their interconnect technology. The InfiniBand architecture is an open industry standard [4] that provides modern concepts for high bandwidth and low latency, as well as reliability, availability, and serviceability (RAS) features. MPICH2 [1], as the successor of one of the most popular open source message passing implementations, aims to ...
An MPICH2 Channel Device Implementation over VAPI on InfiniBand
MPICH2, the successor of one of the most popular open source message passing implementations, aims to fully support the MPI-2 standard. Due to a complete redesign, MPICH2 is also cleaner, more flexible, and faster. The InfiniBand network technology is an open industry standard and provides high bandwidth and low latency, as well as reliability, availability, serviceability (RAS) features. It is...